In many applications it is desirable to enhance the intelligibility of speech that has been processed electronically as, for example, in hearing aids, public address systems, radio or telephone communications, and the like. Although it is helpful to enhance the presentation of both vowel and consonant sounds, the intelligibility of speech depends to such a significant extent on consonant sounds that it is primarily desirable to enhance the intelligibility of the consonants.
Several approaches have characterized recent research into such intelligibility problems, particularly in the hearing aid field. One approach has been to transpose the high frequency sounds in speech to lower frequencies so that they fall within the band of normal hearing acuity, leaving the low frequency sounds unprocessed. Such approaches are discussed, for example, in the article "A Critical Review of Work on Speech Analyzing Hearing Aids" by A. Risberg, IEEE Trans. Audio and Electroacoustics, Vol. AU-17, No. 4, December 1969, pp. 290-297. The degree of success of such an approach appears to be quite limited; the overall improvement in perceiving consonants, for example, has been relatively small.
An alternative approach, akin to the frequency lowering technique, has been to slow down the overall speech, i.e., to lower the frequencies of the overall speech waveform, thereby presenting the higher frequency content at lower frequencies within the listener's normal hearing band. If such a technique is used in real time, segments of the speech have to be removed in order to make room for the remaining temporally expanded segments, and such a process can generate distortion in the speech. Such techniques are discussed in the article "Moderate Frequency Compression for the Moderately Hearing Impaired", M. Mazor et al., J. Acoust. Soc. Am., Vol. 62, No. 5, November 1977, pp. 1273-1278. Although some slight improvement has been observed using such frequency compression techniques for up to about 20% frequency compression, it was also noted that a further increase in frequency compression tended only to reduce intelligibility.
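The trade-off between slowing speech down and keeping pace with real time can be illustrated with simple bookkeeping. The following is a minimal sketch, not the procedure of the cited article: the function name `real_time_slowdown` is hypothetical, and it assumes that "20% frequency compression" means every frequency is scaled by a factor of 0.80. Under that assumption each retained segment lasts 1/0.80 times as long, so a matching fraction of the input must be discarded for the output to keep up with the talker.

```python
def real_time_slowdown(compression_fraction):
    """Bookkeeping for slowed-down speech that is played back in real time.

    compression_fraction is assumed (for illustration only) to mean that
    every frequency is scaled by (1 - compression_fraction), e.g. 0.20
    scales all frequencies by 0.80.

    Returns (frequency_scale, time_stretch, discarded_fraction).
    """
    if not 0.0 <= compression_fraction < 1.0:
        raise ValueError("compression_fraction must be in [0, 1)")

    # Slowing playback lowers every frequency by the same factor.
    frequency_scale = 1.0 - compression_fraction

    # A segment played at the slower rate lasts proportionally longer.
    time_stretch = 1.0 / frequency_scale

    # In real time the output cannot be longer than the input, so only
    # this fraction of the input segments can be retained.
    kept_fraction = 1.0 / time_stretch
    discarded_fraction = 1.0 - kept_fraction

    return frequency_scale, time_stretch, discarded_fraction


scale, stretch, discarded = real_time_slowdown(0.20)
print(f"a 4000 Hz component is presented at {4000 * scale:.0f} Hz")
print(f"each retained segment lasts {stretch:.2f} times as long")
print(f"about {discarded:.0%} of the input must be removed")
```

The discarded fraction grows in step with the amount of compression, which is consistent with the observation that the removed segments become a source of distortion as the compression is increased.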
A basic problem with both high frequency transposition techniques and frequency compression schemes is that they tend to distort the temporal-frequency patterns of speech. Such distortion interferes with the cues needed by the listener to perceive the speech features. As a result, such approaches tend to meet with only limited success in enhancing speech intelligibility.
Another approach to speech intelligibility enhancement is one which preserves the bandwidth of the speech and, instead, modifies the level and dynamic range of the speech waveform. The goal of such a speech processing approach is to make full use of the listener's high frequency hearing abilities. The hearing abilities of the hearing impaired are described, for example, in the article "Differences in Loudness Response of the Normal and Hard of Hearing Ear at Intensity Levels Slightly above Threshold", by S. Reger, Ann. Otol., Rhinol., and Laryngol., Vol. 45, 1936, pp. 1029-1036. In this study of hearing impairment it was noted that soft sounds could not be perceived because of the loss in sensitivity, but that more intense sounds were perceived as having near-normal loudness. This phenomenon, sometimes referred to as "recruitment", has motivated improved hearing aid designs. Thus, an approach that preserves the speech bandwidth and improves intelligibility by modifying the speech waveform dynamics and spectral energy appears to be more effective than frequency transposition or frequency compression techniques, because the features of the speech are better preserved. Although such an approach has achieved some success, as reported in the article "Signal Processing to Improve Speech Intelligibility for the Hearing Impaired" by E. Villchur, J. Acoust. Soc. Am., Vol. 53, pp. 1646-1657, June 1973, improvement is still needed to provide the most effective enhancement of the intelligibility of speech, particularly in the enhancement of consonant sounds.
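The idea of modifying level and dynamic range in the presence of recruitment can be illustrated by a static level-compression rule, in which soft sounds receive more gain than intense ones. The following is a generic sketch and is not the processing described in the cited articles: the function names, the 90 dB reference level, and the 2:1 compression ratio are all hypothetical values chosen only for illustration.

```python
def compressed_level_db(input_db, reference_db=90.0, ratio=2.0):
    """Map an input level to an output level with a fixed compression ratio.

    Levels at reference_db pass unchanged; every `ratio` dB of input
    change below (or above) the reference becomes 1 dB of output change.
    reference_db and ratio are illustrative assumptions.
    """
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1 for compression")
    return reference_db + (input_db - reference_db) / ratio


def gain_db(input_db, reference_db=90.0, ratio=2.0):
    """Gain applied by the rule above: large for soft sounds, zero at the reference."""
    return compressed_level_db(input_db, reference_db, ratio) - input_db


for level in (30.0, 60.0, 90.0):
    print(f"input {level:.0f} dB -> output {compressed_level_db(level):.0f} dB "
          f"(gain {gain_db(level):+.0f} dB)")
```

With these assumed settings a soft sound is raised substantially while a sound at the reference level is left alone, so the dynamic range of the speech is reduced without shifting any of its frequency content.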